Communication-hiding programming for clusters with multi-coprocessor nodes
Abstract
Future exascale systems are expected to adopt compute nodes that incorporate many accelerators. To shed some light on the upcoming software challenge, this paper investigates the particular topic of programming clusters that have multiple Xeon Phi coprocessors in each compute node. For intra-node communication, a new offload approach is considered that combines Intel’s Coprocessor Offload Infrastructure (COI) and Symmetric Communication Interface (SCIF) APIs to achieve low latency. While the conventional pragma-based offload approach allows simpler programming, the COI-SCIF approach has three advantages: (1) lower overhead associated with launching offloaded code, (2) higher data transfer bandwidth, and (3) more advanced asynchrony between computation and data movement. The low-level COI-SCIF approach is also shown to have benefits over its MPI-OpenMP counterpart, which belongs to the symmetric usage mode. Moreover, a hybrid programming strategy based on COI-SCIF is presented that combines the computational power of all CPUs and coprocessors while hiding communication. All the programming approaches are tested with a real-world 3D application, for which the COI-SCIF-based approach shows a performance advantage on Tianhe-2. Copyright © 2015 John Wiley & Sons, Ltd.
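The communication hiding mentioned in the abstract can be illustrated generically. The sketch below is not the paper's COI-SCIF implementation and uses no Intel API: it assumes a chunked workload, uses `time.sleep` as a stand-in for both the data transfer and the offloaded kernel, and uses a Python thread as the asynchronous copy engine. All function names (`transfer`, `compute`, `run_serial`, `run_overlapped`) are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def transfer(chunk_id, delay=0.05):
    """Hypothetical stand-in for a host-to-coprocessor copy (sleep mimics link time)."""
    time.sleep(delay)
    return [chunk_id] * 4


def compute(data, delay=0.05):
    """Hypothetical stand-in for an offloaded kernel (sleep mimics compute time)."""
    time.sleep(delay)
    return sum(data)


def run_serial(n_chunks):
    """Transfer and compute strictly one after the other for each chunk."""
    return [compute(transfer(i)) for i in range(n_chunks)]


def run_overlapped(n_chunks):
    """Prefetch chunk i+1 on a background thread while chunk i is being computed."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as copier:
        pending = copier.submit(transfer, 0)  # start the first copy
        for i in range(n_chunks):
            data = pending.result()  # wait for the current chunk to arrive
            if i + 1 < n_chunks:
                pending = copier.submit(transfer, i + 1)  # next copy runs during compute
            results.append(compute(data))
    return results


if __name__ == "__main__":
    for fn in (run_serial, run_overlapped):
        start = time.perf_counter()
        out = fn(4)
        print(fn.__name__, out, f"{time.perf_counter() - start:.2f}s")
```

Both functions return the same results; in the overlapped version only the first transfer is exposed, so with equal transfer and compute costs the elapsed time approaches that of the computation alone plus one copy.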
Related articles
An Adaptive LEACH-based Clustering Algorithm for Wireless Sensor Networks
LEACH is the most popular clustering algorithm in Wireless Sensor Networks (WSNs). However, it has two main drawbacks: random selection of cluster heads, and direct communication of cluster heads with the sink. This paper aims to introduce a new centralized cluster-based routing protocol named LEACH-AEC (LEACH with Adaptive Energy Consumption), which guarantees to generate balanced cl...
Evaluating one-sided programming models for GPU cluster computations
The Global Array toolkit (GA) [1] is a powerful framework for implementing algorithms with irregular communication patterns, such as those of quantum chemistry. On the other hand, accelerators such as GPUs have shown great potential for important kernels in quantum chemistry, for example, atomic integral generation [2] and dense linear algebra in correlated methods [3]. Integration of the globa...
Hybrid MIC/CPU Parallel Implementation of MoM on MIC Cluster for Electromagnetic Problems
In this paper, a Many Integrated Core Architecture (MIC) accelerated parallel method of moment (MoM) algorithm is proposed to solve electromagnetic problems in practical applications, where MIC means a kind of coprocessor or accelerator in computer systems which is used to accelerate the computation performed by Central Processing Unit (CPU). Three critical points are introduced in this paper i...
MLCA: A Multi-Level Clustering Algorithm for Routing in Wireless Sensor Networks
Energy constraint is the biggest challenge in wireless sensor networks because the power supply of each sensor node is a battery that is not rechargeable or replaceable due to the applications of these networks. One of the successful methods for saving energy in these networks is clustering. As a result, cluster-based routing algorithms have been successful routing algorithms for these networks....
Design and Implementation of High-speed Asynchronous Communication Ports for VLSI Multicomputer Nodes
A communication coprocessor that provides high-bandwidth low-latency inter-node communication is a key component of multicomputer systems composed of hundreds of computing nodes interconnected by point-to-point links. For high reliability, interdependency between nodes is minimized by using a separate clock at each node. Thus, the coprocessor must handle asynchronous inputs with a very low proba...
Journal title:
- Concurrency and Computation: Practice and Experience
Volume 27, Issue
Pages -
Publication date 2015